Skip to content

Conversation

@ela-kotulska-frequenz
Copy link
Contributor

@ela-kotulska-frequenz ela-kotulska-frequenz commented May 22, 2025

When an actor threw an exception during cancellation,
it restarted rather than stopping as expected.
Now it stops and raises and just prints exception.

@ela-kotulska-frequenz ela-kotulska-frequenz self-assigned this May 22, 2025
@Copilot Copilot AI review requested due to automatic review settings May 22, 2025 09:47
@ela-kotulska-frequenz ela-kotulska-frequenz added the type:bug Something isn't working label May 22, 2025
@ela-kotulska-frequenz ela-kotulska-frequenz requested a review from a team as a code owner May 22, 2025 09:47
@github-actions github-actions bot added part:docs Affects the documentation part:tests Affects the unit, integration and performance (benchmarks) tests part:actor Affects an actor ot the actors utilities (decorator, etc.) labels May 22, 2025
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes an issue where actors would restart instead of stopping when exceptions occurred during cancellation. Key changes include the introduction of a new actor subclass that raises an exception during cancellation, updates to the actor’s internal logic to properly handle cancelled state, and a new test that verifies the correct stopping behavior.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
tests/actor/test_actor.py New actor subclass and test to check actor stops on cancellation.
src/frequenz/sdk/actor/_actor.py Updates in actor startup and exception handling during cancellation.
RELEASE_NOTES.md Updated release notes to describe the bug fix.

When an actor throws an exception during cancellation,
it restarts rather than stopping as expected.

Signed-off-by: Elzbieta Kotulska <[email protected]>
@ela-kotulska-frequenz ela-kotulska-frequenz force-pushed the fix_error_loop branch 2 times, most recently from 51c6e89 to 1045eea Compare May 22, 2025 09:56
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a bug where an actor restarted instead of stopping when an exception was raised during cancellation. Key changes include the addition of a new test case in tests/actor/test_actor.py, an adjustment in cancellation and exception logging logic in src/frequenz/sdk/actor/_actor.py, and updated release notes.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
tests/actor/test_actor.py Added a new test and a specialized actor (RaiseExceptionOnCancelActor) to validate proper stop behavior upon cancellation errors.
src/frequenz/sdk/actor/_actor.py Introduced a cancellation flag, refined exception handling in the run loop, and updated the cancel method to prevent actor restart.
RELEASE_NOTES.md Updated the bug fix section to reflect that actors now stop and surface the unhandled exception during cancellation.

Copy link
Contributor

@llucax llucax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks sketchy to me, it looks like supporting a random exception to be raised during cancellation is just a fix for a but we just created by raising this exception.

name: The name of this background service.
"""
super().__init__(name=name)
self._is_cancelled = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we save the main actor task instead and use self._main_task.cancelled() as a single source of truth?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked and in that case self._main_task.cancelled() returns False, because task was not cancelled successfully.

Comment on lines +77 to +78
except asyncio.CancelledError as exc:
raise RuntimeError("Actor should stop.") from exc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we need to support this, this feels wrong, if a task is cancelled, it should not raise some other exception. Cancellation should always be supported and do nothing if it was already cancelled.

Copy link
Contributor Author

@ela-kotulska-frequenz ela-kotulska-frequenz May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a task is cancelled, it should not raise some other exception.

Yes but bugs happens and anything can raise exception. The problem is in this case we restart the actor and await actor.stop() never ends... :/

When an actor threw an exception during cancellation,
it restarted rather than stopping as expected.
Now it stops and raises an unhandled exception.

Signed-off-by: Elzbieta Kotulska <[email protected]>
@github-project-automation github-project-automation bot moved this from To do to Review approved in Python SDK Roadmap May 26, 2025
@shsms shsms mentioned this pull request May 26, 2025
@ela-kotulska-frequenz ela-kotulska-frequenz added this pull request to the merge queue May 26, 2025
Merged via the queue into frequenz-floss:v1.x.x with commit 4b5d908 May 26, 2025
5 checks passed
@ela-kotulska-frequenz ela-kotulska-frequenz deleted the fix_error_loop branch May 26, 2025 13:23
@github-project-automation github-project-automation bot moved this from Review approved to Done in Python SDK Roadmap May 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

part:actor Affects an actor ot the actors utilities (decorator, etc.) part:docs Affects the documentation part:tests Affects the unit, integration and performance (benchmarks) tests type:bug Something isn't working

Projects

Development

Successfully merging this pull request may close these issues.

3 participants